3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation

Authors

Angela Dai

Matthias Niessner

Abstract

We present 3DMV, a novel method for 3D semantic scene segmentation of RGB-D scans in indoor environments using a joint 3D-multi-view prediction network. In contrast to existing methods that use either geometry or RGB data as input for this task, we combine both data modalities in a joint, end-to-end network architecture. Rather than simply projecting color data into a volumetric grid and operating solely in 3D – which would result in insufficient detail – we first extract feature maps from associated RGB images. These features are then mapped into the volumetric feature grid of a 3D network using a differentiable backprojection layer. Since our target is 3D scanning scenarios with possibly many frames, we use a multi-view pooling approach in order to handle a varying number of RGB input views. This learned combination of RGB and geometric features with our joint 2D-3D architecture achieves significantly better results than existing baselines. For instance, our final result on the ScanNet 3D segmentation benchmark [1] increases from 52.8% to 75% accuracy compared to existing volumetric architectures.

Corresponding author: [email protected]

arXiv:1803.10409v1 [cs.CV] 28 Mar 2018
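The backprojection and multi-view pooling steps described in the abstract can be illustrated with a small, non-learned sketch. It is not the authors' implementation: the function names, the nearest-pixel lookup, the depth tolerance `tol`, and the choice of an element-wise maximum as the pooling operator are all assumptions made for illustration, and the code shows only the forward data flow, not a differentiable layer.

```python
from typing import List, Optional, Sequence

Feature = List[float]


def backproject_view(feat2d, depth, intrinsics, world_to_cam, voxel_centers, tol):
    """Look up one 2D feature vector per voxel centre (hypothetical helper).

    feat2d: H x W x C nested lists; depth: H x W nested lists;
    intrinsics: (fx, fy, cx, cy); world_to_cam: 3 x 4 [R|t] matrix;
    tol: allowed gap between a voxel's camera depth and the depth map.
    Returns None for voxels this view does not observe.
    """
    fx, fy, cx, cy = intrinsics
    height, width = len(depth), len(depth[0])
    per_voxel: List[Optional[Feature]] = []
    for p in voxel_centers:
        # Transform the voxel centre from world to camera coordinates.
        x, y, z = (r[0] * p[0] + r[1] * p[1] + r[2] * p[2] + r[3] for r in world_to_cam)
        feature = None
        if z > 1e-6:  # the voxel must lie in front of the camera
            # Pinhole projection to the nearest pixel.
            u = round(fx * x / z + cx)
            v = round(fy * y / z + cy)
            # Keep the feature only if the voxel sits on the observed surface.
            if 0 <= u < width and 0 <= v < height and abs(depth[v][u] - z) <= tol:
                feature = list(feat2d[v][u])
        per_voxel.append(feature)
    return per_voxel


def multi_view_max_pool(per_view: Sequence[Sequence[Optional[Feature]]],
                        channels: int) -> List[Feature]:
    """Element-wise max over any number of views; unobserved voxels become zeros."""
    pooled: List[Feature] = []
    for voxel_feats in zip(*per_view):
        seen = [f for f in voxel_feats if f is not None]
        if seen:
            pooled.append([max(channel) for channel in zip(*seen)])
        else:
            pooled.append([0.0] * channels)
    return pooled
```

Because the pooling reduces over the view axis, the output has the same size whether one view or many are supplied, which is what allows a varying number of RGB inputs per scan.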

Similar resources

Bayesian non-parametrics for multi-modal segmentation

Segmentation is a fundamental and core problem in computer vision research which has applications in many tasks, such as object recognition, content-based image retrieval, and semantic labelling. To partition the data into groups coherent in one or more characteristics such as semantic classes, is often a first step towards understanding the content of data. As information in the real world is ...

Cascaded Scene Flow Prediction using Semantic Segmentation

Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static backg...

Joint Semantic Segmentation and 3D Reconstruction from Monocular Video

We present an approach for joint inference of 3D scene structure and semantic labeling for monocular video. Starting with a monocular image stream, our framework produces a 3D volumetric semantic + occupancy map, which is much more useful than a series of 2D semantic label images or a sparse point cloud produced by traditional semantic segmentation and Structure from Motion (SfM) pipelines respect...

Developing a BIM-based Spatial Ontology for Semantic Querying of 3D Property Information

With the growing dominance of complex and multi-level urban structures, current cadastral systems, which are often developed based on 2D representations, are not capable of providing unambiguous spatial information about urban properties. Therefore, the concept of 3D cadastre is proposed to support 3D digital representation of land and properties and facilitate the communication of legal owners...

Zhile Ren | Research Statement

Figure 1: COG descriptor encodes orientation-invariant gradient feature for objects with different views. I develop new representations and algorithms for three-dimensional (3D) scene understanding from cluttered indoor RGB-D images and outdoor video sequences. I introduce novel representations for 3D object detection systems that localize objects with cuboids and describe room layouts by Manha...

Publication date: 2018
